Fix migrate_sonic_packages() crash on symlink resolv.conf #4365

Merged
yxieca merged 1 commit into sonic-net:master from william8545:installer-check-symlink-public
Mar 20, 2026
@william8545
Contributor

What I did

Fixed sonic-installer install failing during migrate_sonic_packages() when /etc/resolv.conf in the new image is a symlink to /run/resolvconf/resolv.conf.

The failure occurs because the cp command at main.py:386 follows the symlink through the overlay mount. Since the symlink target is an absolute path, it resolves to the host's /run/resolvconf/resolv.conf, which is the same file as the source. cp detects that the source and destination share an inode and exits with:

cp: '/etc/resolv.conf' and '/tmp/image-<version>-fs/etc/resolv.conf' are the same file

This was introduced by the build_debian.sh change that replaced touch with ln -sf /run/resolvconf/resolv.conf for /etc/resolv.conf in the image filesystem.
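
The failure mode can be reproduced outside the installer. The following is a minimal sketch, not the code at main.py:386: it assumes temporary directories as stand-ins for the host root and the mounted image filesystem, and uses Python's `shutil.copyfile` in place of `cp`, since both refuse to copy a file onto itself. All paths and variable names here are hypothetical.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the host's /run/resolvconf/resolv.conf.
    run_file = os.path.join(tmp, "run", "resolvconf", "resolv.conf")
    os.makedirs(os.path.dirname(run_file))
    with open(run_file, "w") as f:
        f.write("nameserver 192.0.2.1\n")

    # The host's etc/resolv.conf and the new image's etc/resolv.conf are both
    # symlinks with the same absolute target, as described above.
    host_etc = os.path.join(tmp, "etc")
    image_etc = os.path.join(tmp, "image-fs", "etc")
    os.makedirs(host_etc)
    os.makedirs(image_etc)
    host_link = os.path.join(host_etc, "resolv.conf")
    image_link = os.path.join(image_etc, "resolv.conf")
    os.symlink(run_file, host_link)
    os.symlink(run_file, image_link)

    # Following both links leads to the same inode.
    same = os.path.samefile(host_link, image_link)

    # Copying "through" the links is therefore a copy of a file onto itself.
    try:
        shutil.copyfile(host_link, image_link)
        copy_failed = False
    except shutil.SameFileError:
        copy_failed = True

print(same, copy_failed)
```

Both values are `True`: the two paths differ, but the absolute symlink target is resolved against the caller's root, not the image root.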

How I did it

Check whether /etc/resolv.conf in the chroot is a symlink or a regular file, and handle each case appropriately:

  • Symlink (images with resolvconf package installed): Read the symlink target via readlink (e.g. /run/resolvconf/resolv.conf), then create the target file inside the chroot with the host's DNS content. The symlink then resolves correctly inside the chroot. This avoids touching the symlink itself, so the overlay upper dir's etc/resolv.conf is never modified and the new image boots with the symlink intact. No cleanup is needed — the target lives under /run, which is a tmpfs recreated at every boot.

  • Regular file (images without resolvconf, or where the build process explicitly creates a regular file via touch): Overwrite directly with the host's DNS content. No backup/restore is needed — the original file is empty (cleared during build), and after reboot the resolv-config service reconfigures DNS from CONFIG_DB.

The previous backup-overwrite-restore pattern has been removed since it is unnecessary in both cases.
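
The two cases above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the merged implementation: `populate_resolv_conf` is a hypothetical helper, it uses plain Python file operations where the installer would run commands against the mounted image, and the handling of relative symlink targets is an addition that the PR description does not mention.

```python
import os
import shutil
import tempfile


def populate_resolv_conf(host_resolv: str, image_root: str) -> str:
    """Make the host's DNS settings visible inside image_root.

    Returns the path that was written. Hypothetical helper for illustration.
    """
    in_image = os.path.join(image_root, "etc", "resolv.conf")
    if os.path.islink(in_image):
        # Symlink case: leave the link untouched and create its target
        # inside the image root instead.
        target = os.readlink(in_image)  # e.g. /run/resolvconf/resolv.conf
        if not os.path.isabs(target):
            # Assumption: resolve relative targets against /etc.
            target = os.path.normpath(os.path.join("/etc", target))
        dest = os.path.join(image_root, target.lstrip("/"))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
    else:
        # Regular-file case: overwrite directly, no backup or restore.
        dest = in_image
    shutil.copyfile(host_resolv, dest)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    host_resolv = os.path.join(tmp, "host-resolv.conf")
    with open(host_resolv, "w") as f:
        f.write("nameserver 192.0.2.1\n")

    image_root = os.path.join(tmp, "image-fs")
    os.makedirs(os.path.join(image_root, "etc"))
    link = os.path.join(image_root, "etc", "resolv.conf")
    os.symlink("/run/resolvconf/resolv.conf", link)

    written = populate_resolv_conf(host_resolv, image_root)
    rel_written = os.path.relpath(written, image_root)
    link_intact = os.path.islink(link)
    with open(written) as f:
        content = f.read()

print(rel_written, link_intact)
```

In this sketch the file lands at `run/resolvconf/resolv.conf` under the image root and `etc/resolv.conf` remains a symlink, which is the property the fix relies on to keep the overlay upper directory unmodified.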

How to verify it

  1. Start with a switch running an image where /etc/resolv.conf is a symlink:

    # Confirm symlink exists
    ls -la /etc/resolv.conf
    # Expected: /etc/resolv.conf -> /run/resolvconf/resolv.conf
    
    # Ensure only one image is installed (clean state)
    sudo sonic-installer list
    # If the target image is already present, remove it:
    sudo sonic-installer remove <target-image> -y
  2. Run sonic-installer install with an image that also has the symlink:

    sudo sonic-installer install <image-path> -y
  3. Verify:

    • Installation completes without the "cp: ... are the same file" error
    • sonic-installer list shows the new image as default
    • After reboot, /etc/resolv.conf is still a symlink to /run/resolvconf/resolv.conf
    • DNS resolution works

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

When /etc/resolv.conf in the new image is a symlink (e.g. -> /run/resolvconf/resolv.conf), the cp command follows it through the overlay mount. The absolute target path resolves to the host's file, so cp fails with an "are the same file" error.

Detect symlinks and populate the target path inside the chroot instead of copying over the symlink. For regular files, overwrite directly with host DNS content. Backup and restore are removed since the symlink target lives under /run (tmpfs, recreated at boot) and the regular file case overwrites an empty file that will be reconfigured by resolv-config service after reboot.

Signed-off-by: William Tsai <willtsai@nvidia.com>
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vmittal-msft
Contributor

@saiarcot895 please help approve

Contributor

@yxieca yxieca left a comment

Looks good. Handles symlinked /etc/resolv.conf correctly and updates tests. AI agent on behalf of Ying.

@yxieca yxieca merged commit fb3d73d into sonic-net:master Mar 20, 2026
9 checks passed
@mssonicbld
Collaborator

Cherry-pick PR to 202511: #4380

mssonicbld added a commit to mssonicbld/sonic-buildimage that referenced this pull request Apr 2, 2026
#### Dependency
This PR depends on sonic-net/sonic-utilities#4365. The other PR should be merged first before this one can be merged.

#### Why I did it

After installing SONiC 202511 from ONIE, 10 out of 15 docker containers have empty `/etc/resolv.conf` and no DNS resolution. This is a regression from 202412.

The Trixie base image upgrade introduced two lines in `build_debian.sh` that destroy the `/etc/resolv.conf` symlink (created by the `resolvconf` package) and replace it with a regular empty file:

```bash
sudo rm -f $FILESYSTEM_ROOT/etc/resolv.conf
sudo touch $FILESYSTEM_ROOT/etc/resolv.conf
```

This breaks the DNS propagation chain to docker containers because `/etc/resolvconf/update.d/libc` checks whether `/etc/resolv.conf` is a symlink to `/run/resolvconf/resolv.conf` before notifying downstream consumers (including `update-libc.d/update-containers`). When the symlink is missing, DHCP-obtained DNS is never propagated to containers.
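
The gating condition described above can be sketched as follows. This is an assumption-laden illustration, not the contents of `/etc/resolvconf/update.d/libc`: the helper name `is_resolvconf_managed` is hypothetical, and the real script may compare the link target differently.

```python
import os
import tempfile

EXPECTED_TARGET = "/run/resolvconf/resolv.conf"


def is_resolvconf_managed(path: str, expected: str = EXPECTED_TARGET) -> bool:
    """Return True only if path is a symlink whose target equals expected."""
    return os.path.islink(path) and os.readlink(path) == expected


with tempfile.TemporaryDirectory() as tmp:
    # Mimics the `touch` variant: a regular empty file.
    regular = os.path.join(tmp, "touched-resolv.conf")
    open(regular, "w").close()

    # Mimics the `ln -sf` variant: a symlink to the resolvconf-managed file.
    linked = os.path.join(tmp, "linked-resolv.conf")
    os.symlink(EXPECTED_TARGET, linked)

    results = (is_resolvconf_managed(regular), is_resolvconf_managed(linked))

print(results)
```

Only the symlinked variant passes the check, which is why an image built with `touch` never reaches the step that notifies downstream consumers.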

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it

Replaced `sudo touch` with `sudo ln -sf /run/resolvconf/resolv.conf` to preserve the symlink that the `resolvconf` package expects:

```bash
sudo rm -f $FILESYSTEM_ROOT/etc/resolv.conf
sudo ln -sf /run/resolvconf/resolv.conf $FILESYSTEM_ROOT/etc/resolv.conf
```

This is consistent with what `resolv-config.sh` does at runtime (`ln -sf /run/resolvconf/resolv.conf /etc/resolv.conf`) and matches the behavior of all SONiC releases prior to 202511.

#### How to verify it

1. Install from ONIE on a switch
2. After boot, verify:
   ```bash
   # Host resolv.conf should be a symlink
   ls -la /etc/resolv.conf
   # Expected: /etc/resolv.conf -> /run/resolvconf/resolv.conf

   # All containers should have DNS
   for c in $(docker ps --format '{{.Names}}'); do
     echo "=== $c ==="
     docker exec $c cat /etc/resolv.conf
   done
   ```

#### Which release branch to backport (provide reason below if selected)

- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

Signed-off-by: Sonic Build Admin <sonicbld@microsoft.com>
mssonicbld added a commit to sonic-net/sonic-buildimage that referenced this pull request Apr 3, 2026
#26535)
